AITopics | Craiova

It's Hard to Be Normal: The Impact of Noise on Structure-agnostic Estimation

Jin, Jikai, Mackey, Lester, Syrgkanis, Vasilis

arXiv.org Machine Learning Jul-11-2025

Structure-agnostic causal inference studies how well one can estimate a treatment effect given black-box machine learning estimates of nuisance functions (like the impact of confounders on treatment and outcomes). Here, we find that the answer depends in a surprising way on the distribution of the treatment noise. Focusing on the partially linear model of Robinson (1988), we first show that the widely adopted double machine learning (DML) estimator is minimax rate-optimal for Gaussian treatment noise, resolving an open problem of Mackey et al. (2018). Meanwhile, for independent non-Gaussian treatment noise, we show that DML is always suboptimal by constructing new practical procedures with higher-order robustness to nuisance errors. These ACE procedures use structure-agnostic cumulant estimators to achieve $r$-th order insensitivity to nuisance errors whenever the $(r+1)$-st treatment cumulant is non-zero. We complement these core results with novel minimax guarantees for binary treatments in the partially linear model. Finally, using synthetic demand estimation experiments, we demonstrate the practical benefits of our higher-order robust estimators.
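
For readers unfamiliar with the DML baseline mentioned in this abstract, the following is a minimal sketch of a cross-fitted residual-on-residual estimator in the partially linear model Y = theta*T + g(X) + eps, T = m(X) + eta. It is not the authors' code and does not implement the ACE procedures. The helper names (`fit_linear`, `dml_plm`), the linear nuisance models, and the synthetic data are all assumptions made for illustration; in practice any black-box regressor could replace `fit_linear`.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept; returns a prediction function.
    Hypothetical stand-in for a black-box nuisance learner."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def dml_plm(X, T, Y, n_folds=2, seed=0):
    """Cross-fitted residual-on-residual estimate of theta."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(Y)), n_folds)
    t_res = np.empty(len(Y))
    y_res = np.empty(len(Y))
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        m_hat = fit_linear(X[train_idx], T[train_idx])  # estimates E[T | X]
        q_hat = fit_linear(X[train_idx], Y[train_idx])  # estimates E[Y | X]
        # Residuals are computed only on the held-out fold.
        t_res[test_idx] = T[test_idx] - m_hat(X[test_idx])
        y_res[test_idx] = Y[test_idx] - q_hat(X[test_idx])
    # Regress outcome residuals on treatment residuals.
    return float(t_res @ y_res / (t_res @ t_res))

# Synthetic data with a true effect of 2.0 (assumed for illustration).
rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 3))
T = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)
Y = 2.0 * T + X @ np.array([0.3, 0.7, -1.0]) + rng.normal(size=n)
theta_hat = dml_plm(X, T, Y)
```

Here the treatment noise is Gaussian, the setting in which the abstract reports DML to be minimax rate-optimal.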

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

2507.02275

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Probing Language Models on Their Knowledge Source

Tighidet, Zineddine, Mogini, Andrea, Mei, Jiali, Piwowarski, Benjamin, Gallinari, Patrick

arXiv.org Artificial Intelligence Nov-9-2024

Large Language Models (LLMs) often encounter conflicts between their learned, internal knowledge (parametric knowledge, PK) and external knowledge provided during inference (contextual knowledge, CK). Understanding how LLMs prioritize one knowledge source over the other remains a challenge. In this paper, we propose a novel probing framework to explore the mechanisms governing the selection between PK and CK in LLMs. Using controlled prompts designed to contradict the model's PK, we demonstrate that specific model activations are indicative of the knowledge source employed. We evaluate this framework on various LLMs of different sizes and demonstrate that mid-layer activations, particularly those related to relations in the input, are crucial in predicting knowledge source selection, paving the way for more reliable models capable of handling knowledge conflicts effectively.
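
As a rough illustration of what probing activations can look like, the sketch below trains a logistic-regression probe to predict a binary label (for example PK versus CK) from activation vectors. It is not the paper's framework: the activations are synthetic, and `train_probe`, the dimensions, and the label-dependent shift are assumptions chosen so the example runs on its own.

```python
import numpy as np

def train_probe(H, y, lr=0.5, steps=500):
    """Fit a logistic-regression probe by plain gradient descent."""
    w = np.zeros(H.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(H @ w + b)))  # predicted P(label = 1)
        grad = p - y                             # gradient of the log-loss
        w -= lr * H.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic "activations": Gaussian noise plus a label-dependent shift
# along one hidden direction (hypothetical stand-in for real model states).
rng = np.random.default_rng(0)
n, d = 400, 32
y = rng.integers(0, 2, size=n).astype(float)
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
H = rng.normal(size=(n, d)) + 3.0 * np.outer(2 * y - 1, direction)

# Train on the first 300 examples, evaluate on the remaining 100.
w, b = train_probe(H[:300], y[:300])
acc = float(np.mean(((H[300:] @ w + b) > 0) == (y[300:] == 1)))
```

High held-out accuracy for a given layer would suggest that the label is linearly decodable from that layer's activations, which is the kind of evidence a probing study relies on.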

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.05817

Country:

Europe > Croatia (0.14)
North America > United States > Virginia (0.05)
Europe > Italy (0.05)
(17 more...)

Genre: Research Report > New Finding (0.68)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

Fang, Junfeng, Jiang, Houcheng, Wang, Kun, Ma, Yunshan, Wang, Xiang, He, Xiangnan, Chua, Tat-seng

arXiv.org Artificial Intelligence Oct-21-2024

Large language models (LLMs) often exhibit hallucinations due to incorrect or outdated knowledge. Hence, model editing methods have emerged to enable targeted knowledge updates. To achieve this, a prevailing paradigm is the locating-then-editing approach, which first locates influential parameters and then edits them by introducing a perturbation. While effective, current studies have demonstrated that this perturbation inevitably disrupts the originally preserved knowledge within LLMs, especially in sequential editing scenarios. To address this, we introduce AlphaEdit, a novel solution that projects the perturbation onto the null space of the preserved knowledge before applying it to the parameters. We theoretically prove that this projection ensures the output of post-edited LLMs remains unchanged when queried about the preserved knowledge, thereby mitigating the issue of disruption. Extensive experiments on various LLMs, including LLaMA3, GPT2-XL, and GPT-J, show that AlphaEdit boosts the performance of most locating-then-editing methods by an average of 36.4% with only a single additional line of code for the projection. Our code is available at: https://github.com/jianghoucheng/AlphaEdit.
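
The projection idea in this abstract can be illustrated with a small linear-algebra sketch. It is not taken from the AlphaEdit repository: the function name `null_space_projector`, the matrix shapes, and the random data are assumptions. Given a matrix `K0` whose columns are keys for preserved knowledge, the projector `P` satisfies `P @ K0 = 0`, so a projected perturbation `Delta @ P` leaves the layer's outputs on those keys unchanged.

```python
import numpy as np

def null_space_projector(K0, tol=1e-8):
    """Return P = I - U_r U_r^T, the orthogonal projector that
    annihilates the column space of K0, so that P @ K0 is zero."""
    U, s, _ = np.linalg.svd(K0, full_matrices=False)
    U_r = U[:, s > tol * s.max()]  # left singular vectors spanning col(K0)
    return np.eye(K0.shape[0]) - U_r @ U_r.T

rng = np.random.default_rng(0)
d_out, d_in, n_keys = 8, 16, 5
W = rng.normal(size=(d_out, d_in))      # original weights (hypothetical layer)
K0 = rng.normal(size=(d_in, n_keys))    # keys of preserved knowledge
Delta = rng.normal(size=(d_out, d_in))  # unconstrained edit perturbation

P = null_space_projector(K0)
W_edited = W + Delta @ P                # apply the projected perturbation

# Outputs on the preserved keys are unchanged up to floating-point error.
preserved = bool(np.allclose(W_edited @ K0, W @ K0))
```

The projection only has room to act when the preserved keys do not span the whole input space (here 5 keys in 16 dimensions); otherwise `P` collapses to zero and no edit survives.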

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.02355

Country:

Europe > Spain > Galicia > Madrid (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > Greece (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reddit is all you need: Authorship profiling for Romanian

Ştefănescu, Ecaterina, Jerpelea, Alexandru-Iulius

arXiv.org Artificial Intelligence Oct-13-2024

Authorship profiling is the process of identifying an author's characteristics based on their writings. This centuries-old problem has become more intriguing, especially with recent developments in Natural Language Processing (NLP). In this paper, we introduce a corpus of short texts in the Romanian language, annotated with certain author characteristic keywords; to our knowledge, the first of its kind. To do this, we exploit a social media platform called Reddit. We leverage its thematic community-based structure (subreddit structure), which offers information about the author's background. We infer a user's demographic and some broad personal traits, such as age category, employment status, interests, and social orientation, based on the subreddit and other cues. We thus obtain a corpus of 23k+ samples, extracted from 100+ Romanian subreddits. We analyse our dataset, and finally, we fine-tune and evaluate Large Language Models (LLMs) to establish baseline capabilities for authorship profiling using the corpus, indicating the need for further research in the field. We publicly release all our resources.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.09907

Country:

Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.05)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
(14 more...)

Genre: Research Report (0.40)

Industry: Media > News (0.74)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)

Learning K-U-Net with constant complexity: An Application to time series forecasting

You, Jiang, Cela, Arben, Natowicz, René, Ouanounou, Jacob, Siarry, Patrick

arXiv.org Artificial Intelligence Oct-3-2024

Training deep models for time series forecasting is a critical task with an inherent challenge of time complexity. While current methods generally ensure linear time complexity, our observations on temporal redundancy show that high-level features are learned 98.44% slower than low-level features. To address this issue, we introduce a new exponentially weighted stochastic gradient descent algorithm designed to achieve constant time complexity in deep learning models. We prove that the theoretical complexity of this learning method is constant. Evaluation of this method on Kernel U-Net (K-U-Net) on synthetic datasets shows a significant reduction in complexity while improving accuracy on the test set.

complexity, gradient, time series forecasting, (13 more...)

arXiv.org Artificial Intelligence

2410.02438

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Enabling Intelligent Traffic Systems: A Deep Learning Method for Accurate Arabic License Plate Recognition

Sayedelahl, M. A.

arXiv.org Artificial Intelligence Aug-5-2024

This paper introduces a novel two-stage framework for accurate Egyptian Vehicle License Plate Recognition (EVLPR). The first stage employs image processing techniques to reliably localize license plates, while the second stage utilizes a custom-designed deep learning model for robust Arabic character recognition. The proposed system achieves a remarkable 99.3% accuracy on a diverse dataset, surpassing existing approaches. Its potential applications extend to intelligent traffic management, including traffic violation detection and parking optimization. Future research will focus on enhancing the system's capabilities through architectural refinements, expanded datasets, and addressing system dependencies.

character recognition, license plate, recognition, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.69888/FTSIN.2024.000156

2408.02904

Country:

Europe > Austria > Vienna (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Oceania > Australia (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation (0.49)
Media (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Expected Possession Value of Control and Duel Actions for Soccer Player's Skills Estimation

Shelopugin, Andrei

arXiv.org Artificial Intelligence Jun-2-2024

Estimation of football players' skills is one of the key tasks in sports analytics. This paper introduces multiple extensions to a widely used model, expected possession value (EPV), to address some key challenges such as the selection problem. First, we assign greater weights to events occurring immediately prior to the shot than to those preceding them (decay effect). Second, our model incorporates possession risk more accurately by considering the decay effect and effective playing time. Third, we integrate the assessment of individual player ability to win aerial and ground duels. Using the extended EPV model, we predict this metric for various football players for the upcoming season, particularly taking into account the strength of their opponents.

control action, duel, possession, (15 more...)

arXiv.org Artificial Intelligence

2406.00814

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Netherlands (0.05)
Europe > Sweden > Skåne County > Malmö (0.04)
(30 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)

Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation

Kimera, Richard, Kim, Yun-Seon, Choi, Heeyoul

arXiv.org Artificial Intelligence Apr-1-2024

This paper addresses the ethical challenges of Artificial Intelligence in Neural Machine Translation (NMT) systems, emphasizing the imperative for developers to ensure fairness and cultural sensitivity. We investigate the ethical competence of AI models in NMT, examining the ethical considerations at each stage of NMT development, including data handling, privacy, data ownership, and consent. We identify and address ethical issues through empirical studies. These include employing Transformer models for Luganda-English translations and enhancing efficiency with sentence mini-batching, along with complementary studies that refine data labeling techniques and fine-tune BERT and Longformer models for analyzing Luganda and English social media content. Our second approach is a literature review drawing on databases such as Google Scholar and platforms like GitHub. Additionally, the paper probes the distribution of responsibility between AI systems and humans, underscoring the essential role of human oversight in upholding NMT ethical standards. Incorporating a biblical perspective, we discuss the societal impact of NMT and the broader ethical responsibilities of developers, positing them as stewards accountable for the societal repercussions of their creations.

dataset, integrity, translation, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.22724/LMR.2024.22.1.171

2404.0107

Country:

Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Optical Fiber-Based Needle Shape Sensing in Real Tissue: Single Core vs. Multicore Approaches

Lezcano, Dimitri A., Zhetpissov, Yernar, Cheng, Alexandra, Kim, Jin Seob, Iordachita, Iulian I.

arXiv.org Artificial Intelligence Sep-8-2023

Flexible needle insertion procedures are common in minimally invasive surgeries for diagnosing and treating prostate cancer. Bevel-tip needles give physicians the capability to steer the needle during long insertions to avoid vital anatomical structures in the patient and reduce post-operative patient discomfort. To provide needle placement feedback to the physician, sensors are embedded into needles to determine the real-time 3D shape of the needle during operation without needing to visualize the needle intra-operatively. Through expansive research in fiber optics, a plethora of bio-compatible, MRI-compatible optical shape sensors have been developed to provide real-time shape feedback, such as single-core and multicore fiber Bragg gratings. In this paper, we directly compare single-core fiber-based and multicore fiber-based needle shape-sensing through identically constructed, four-active-area sensorized bevel-tip needles inserted into phantom and ex vivo tissue on the same experimental platform. We found that for shape-sensing in phantom tissue, the two needles performed identically with a $p$-value of $0.164 > 0.05$, but in ex vivo real tissue, the single-core fiber sensorized needle significantly outperformed the multicore fiber configuration with a $p$-value of $0.0005 < 0.05$. This paper also presents the experimental platform and method for directly comparing these optical shape sensors for the needle shape-sensing task, and provides direction, insight, and required considerations for future work in constructively optimizing sensorized needles.

curvature, mcf needle, sensor, (11 more...)

arXiv.org Artificial Intelligence

2309.04407

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Maryland > Baltimore (0.04)
Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Health & Medicine > Therapeutic Area > Urology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Design and Fabrication of a Fiber Bragg Grating Shape Sensor for Shape Reconstruction of a Continuum Manipulator

Amirkhani, Golchehr, Goodridge, Anna, Esfandiari, Mojtaba, Phalen, Henry, Ma, Justin H., Iordachita, Iulian, Armand, Mehran

arXiv.org Artificial Intelligence Mar-6-2023

Continuum dexterous manipulators (CDMs) are suitable for performing tasks in a constrained environment due to their high dexterity and maneuverability. Despite the inherent advantages of CDMs in minimally invasive surgery, real-time control of CDMs' shape during non-constant curvature bending is still challenging. This study presents a novel approach for the design and fabrication of a large deflection fiber Bragg grating (FBG) shape sensor embedded within the lumens inside the walls of a CDM with a large instrument channel. The shape sensor consisted of two fibers, each with three FBG nodes. A shape-sensing model was introduced to reconstruct the centerline of the CDM based on FBG wavelengths. Different experiments, including shape sensor tests and CDM shape reconstruction tests, were conducted to assess the overall accuracy of the shape sensing. The FBG sensor evaluation results revealed a linear curvature-wavelength relationship with a large curvature detection of 0.045 mm at a 90-degree bending angle and a sensitivity of up to 5.50 nm/mm in each bending direction. The CDM's shape reconstruction experiments in a free environment demonstrated a shape tracking accuracy of 0.216 ± 0.126 mm for positive/negative deflections. Also, the CDM shape reconstruction errors for three cases of bending with obstacles were observed to be 0.436 ± 0.370 mm for the proximal case, 0.485 ± 0.418 mm for the middle case, and 0.312 ± 0.261 mm for the distal case. This study indicates the adequate performance of the FBG sensor and the effectiveness of the model for tracking the shape of the large-deflection CDM with non-constant curvature bending for minimally invasive orthopaedic applications.

artificial intelligence, cdm, centerline, (16 more...)

arXiv.org Artificial Intelligence

2303.03613

Country:

North America > United States > Maryland > Baltimore (0.05)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Europe > Switzerland (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (0.98)
Information Technology > Sensing and Signal Processing (0.93)
